feat(trace-exporter): add v1 span and its encoder#2039
Conversation
🎉 All green!🧪 All tests passed 🎯 Code Coverage (details) 🔗 Commit SHA: b22e805 | Docs | Datadog PR Page | Give us feedback! |
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
📚 Documentation Check Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
🔒 Cargo Deny Results📦
|
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2039 +/- ##
==========================================
+ Coverage 73.44% 73.88% +0.44%
==========================================
Files 465 472 +7
Lines 77949 79572 +1623
==========================================
+ Hits 57248 58793 +1545
- Misses 20701 20779 +78
🚀 New features to boost your workflow:
|
…-encoder # Conflicts: # libdd-trace-utils/src/msgpack_encoder/v04/mod.rs # libdd-trace-utils/src/msgpack_encoder/v1/mod.rs
Tighten module-level docs, drop verbose design-doc references, and shorten per-item comments on the V1 data model and msgpack encoder to match the concise style used in v04.
6b434a7 to
53ae773
Compare
ajgajg1134
left a comment
There was a problem hiding this comment.
Overall this looks good from a v1 perspective but definitely someone with more Rust review experience should take a peek.
One small idea (probably not necessary for this specific PR), can we implement a round-trip fuzzer here to verify that if we start with a valid payload and encode -> decode it we end up with the same thing? Or something like that would help build confidence that we have consistent implementations
| //! Cross-validates that the M1 encoder (v0.4 spans → V1 payload) and the M3 encoder | ||
| //! (v1::Span → V1 payload) produce **byte-identical** output for equivalent inputs. | ||
| //! | ||
| //! All tests are limited to deterministic content (at most one attribute key per map) so the |
There was a problem hiding this comment.
Is there a way we could swap in (just for testing) a consistent ordered hash map to make these tests more versatile?
There was a problem hiding this comment.
It's do able but probably not worth it right now imo. We could do :
- A test-only feature flag that use BTreeMap/IndexMap instead of HashMap in the encoder's storage type.
- Or sort entries at encode time in both encoders.
One add complexity and the other one change production behavior for a temporary concern: once tracers switch to native span, and thus to full V1, the v0.4 to V1 encoder goes away and this whole cross-validation suite would go with it.
There was a problem hiding this comment.
I'm not comfortable with constraining collections to size n=1. There are potential edge cases we aren't catching.
I would question what the benefit of byte-identical tests are? It's not something that actually matters IRL (as evident by the n=1 constraint on collections in the first place). Wouldn't it be better for the tests to be structured so that the two payloads are decoded and verified to logically match? You could sort the decoded collections to assert they match.
There was a problem hiding this comment.
@ekump Here is the solution I implemented => replaced the byte-equality cross_validation_tests with a single integration test v04_and_v1_encoders_produce_equivalent_decoded_traces in tests/test_send_data.rs that :
- Builds equivalent v0.4 and v1::Span payloads
- Encodes both via their respective encoders
- POSTs both to ddapm-test-agent's
/v1.0/traces - The test-agent groups them by trace_id and we assert the two decoded spans are structurally equal after recursive key normalization
This is a real round-trip check (no n=1 constraint) and doubles as the "round-trip fuzzer" you suggested in your overall comment @ajgajg1134.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c00fb1921a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
What does this PR do?
Adds the canonical V1 trace payload data model (
v1::Span,TracerPayload,TraceChunk, typedAttributeValue,SpanKind, 128-bittrace_id,error: bool) and a matching msgpack encoder that serializes this model into the V1 wire format. New public APIs:to_vec_from_payload,to_vec_from_payload_with_capacity,write_payload_to_slice,to_encoded_byte_len_from_payload.Motivation
APMSP-2810, M3 of the V1 trace payload rollout. Tracers will eventually produce V1 payloads natively rather than going through the v0.4→V1 conversion path implemented in #1896, so libdatadog needs a canonical V1 data model and a corresponding encoder.
Additional Notes
SpanKey,SpanLinkKey,SpanEventKey,AnyValueKey, plus thetrace_keyandchunk_keymodules) live inmsgpack_encoder/v1/mod.rsand are shared between the v0.4→V1 encoder and the v1::Span encoder.